State analysis of remote computing images

ABSTRACT

Systems and methods for analyzing a computing image (e.g., container image, virtual disk image) while it is on a remote node in a secured environment. An example method may comprise: initiating a proxy agent on a node, the proxy agent having access to an image repository comprising an image; transmitting to the proxy agent a request for image data of the image; receiving the image data from the proxy agent; and analyzing the image data to determine a state of the image.

TECHNICAL FIELD

The present disclosure is generally related to analyzing the state of a computing image, and is more specifically related to analyzing a computing image while it is on a remote node in a secured environment.

BACKGROUND

The virtualization of a data center results in a physical system being virtualized using virtual machines to consolidate the data center infrastructure and increase operational efficiencies. A virtual machine (VM) may be an emulation of computer hardware. For example, the VM may operate based on computer architecture and functions of computer hardware resources associated with hard disks or other such memory. The VM may emulate a physical computing environment, but requests for a hard disk or memory may be managed by a virtualization layer of a host machine to translate these requests to the underlying physical computing hardware resources. This type of virtualization results in multiple virtual machines sharing physical resources.

The physical systems and virtual systems (e.g., virtual machines) may each have a state that can be persisted and stored as an image. The state of a physical system may be stored as a hard disk image and the hard disk image may be used to configure one or more other physical systems. The state of a virtual machine may be stored as a virtual machine disk image and may be used to run one or more virtual machines. Either type of image may contain the state of the system at a point in time and may be updated to reflect changes that occur to the system over time.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is illustrated by way of examples, and not by way of limitation, and may be more fully understood with references to the following detailed description when considered in connection with the figures, in which:

FIG. 1 depicts a high-level block diagram of an example distributed system operating in accordance with one or more aspects of the present disclosure;

FIG. 2 depicts a block diagram of an example system operating in accordance with one or more aspects of the present disclosure;

FIG. 3 depicts a flow diagram of an example method for analyzing a computing image using a proxy agent, in accordance with one or more aspects of the present disclosure;

FIG. 4 depicts a flow diagram of another example method for analyzing a computing image using a proxy agent, in accordance with one or more aspects of the present disclosure;

FIG. 5 depicts a block diagram of an illustrative computing device operating in accordance with the examples of the present disclosure.

DETAILED DESCRIPTION

Many modern computing systems enable a state of a computing system to be stored as an image. The image may include confidential or proprietary executables and configuration information that may be loaded onto a node and executed by the node to provide a computing service. Due to the confidential nature of the images, the images may be stored on an image repository with restricted access to enhance security. For example, the image repository may be accessible to a node executing the image but may not be accessible to a management server. The management server may manage the one or more nodes in the computing environment and may determine which nodes execute which images. The management server may base its operations on the state of the images and the enhanced security may adversely affect the ability of the management server to determine the state of the images.

Aspects of the present disclosure address the above and other deficiencies by providing technology for determining the state of an image stored on a remote node when direct access to the image is limited (e.g., no read access). In one example, a computing device may manage the operations of one or more nodes that include physical machines, virtual machines, or a combination thereof and each of the nodes may be capable of accessing and executing images from an image repository. The computing device may have access to the nodes but may not have direct access to the image repository because of security measures in place to protect the images. To gather information about the images, the computing device may initiate a proxy agent on a node that has access to an image repository. In one example, the node may provide operating system virtualization that supports one or more containers and the executable proxy agent may run within a container. The container may be a resource-constrained process space of the node that can execute functionality of the proxy agent in a secure manner. Once the proxy agent is running on the node, the computing device may transmit one or more requests to the proxy agent. The requests may include operations or instructions for gathering image data from the image. In one example, the requests may be in the form of network requests (e.g., HTTP requests) that include operations for searching the image for particular image data that represents the state of the image (e.g., operating system features, running programs, hardware architecture). The computing device may then receive, store, and analyze the image data to determine the state of the image.

The systems and methods described herein include technology that enhances the performance of image state analysis. In particular, aspects of the present disclosure may enable a computing device to analyze an image in a secured environment without accessing the image directly or downloading the image locally. Aspects of the present disclosure may also enable the image analysis logic to be centralized (e.g., consolidated). Centralizing the image analysis logic on the computing device may enable the image analysis logic to be more easily updated or replaced to detect additional or different states of an image. Aspects of the present disclosure may also enable a computing device to analyze an image in a more resource-efficient manner. Images are often stored as large files (e.g., multiple gigabytes) and remotely analyzing the image may reduce the amount of data being retrieved from the image and transmitted to the computing device. This may reduce the amount of networking resources, input/output (I/O) resources, processing, memory, or other computing resources consumed to analyze the state of an image. Aspects of the present disclosure may also enable the data retrieved from the image to be stored and re-analyzed when the image analysis logic is updated (e.g., updated to detect a new flaw). The subsequent analysis may be done long after the image data was retrieved without re-retrieving or re-transmitting the data from the remote image.

Various aspects of the above referenced methods and systems are described in detail herein below by way of examples, rather than by way of limitation. The examples provided below discuss a virtualized environment, but other examples may include a standard operating system running on an individual computing device without virtualization (e.g., without a hypervisor).

FIG. 1 illustrates an example distributed system 100 in which implementations of the disclosure may operate. The distributed system 100 may include a manager 110, a node 120, and an image repository 130 coupled via one or more networks 140 and 150. Networks 140 and 150 may be public networks (e.g., the internet), private networks (e.g., a local area network (LAN) or wide area network (WAN)), or a combination thereof. In one example, networks 140 and 150 may be similar but network 150 may be more secured and restrict access to a subset of the devices (e.g., node 120 and image repository 130). Networks 140 and 150 may include a wired or a wireless infrastructure, which may be provided by one or more wireless communications systems, such as a wireless fidelity (WiFi) hotspot connected with the network 140 and/or a wireless carrier system that can be implemented using various data processing equipment, communication towers, etc.

Manager 110 may be hosted by a computing device and include one or more computer programs executed by the computing device for centralized management of distributed system 100. In one implementation, the manager 110 may comprise various interfaces, including an administrative interface, a reporting interface, and/or an application programming interface (API) to communicate with node 120, as well as to user portals, databases, directory servers and various other components, which are omitted from FIG. 1 for clarity. Manager 110 may include an image analysis component 112 that interacts with one or more nodes 120 to determine the state of an image on image repository 130.

Node 120 may comprise a computing device with one or more processors communicatively coupled to memory devices and input/output (I/O) devices, as described in more detail herein below with reference to FIG. 5. Node 120 may include an operating system 122 with one or more user space programs. Operating system 122 may be any program or combination of programs that are capable of using the underlying computing device to perform computing tasks. Operating system 122 may include a kernel comprising one or more kernel space programs (e.g., memory driver, network driver, file system driver) for interacting with virtual hardware devices or actual hardware devices (e.g., para-virtualization). User space programs may include programs that are capable of being executed by operating system 122 and in one example may be an application program for interacting with a user. Although node 120 comprises a computing device, the term “node” may refer to the computing device (e.g., physical machine), a virtual machine, or a combination thereof.

Node 120 may provide one or more levels of virtualization such as hardware level virtualization, operating system level virtualization, other virtualization, or a combination thereof. Node 120 may provide hardware level virtualization by running a hypervisor that provides hardware resources to one or more virtual machines. The hypervisor may be any program or combination of programs and may run on a host operating system or may run directly on the hardware (e.g., bare-metal hypervisor). The hypervisor may manage and monitor various aspects of the operation of the computing device, including the storage, memory, and network interfaces. The hypervisor may abstract the physical layer features such as processors, memory, and I/O devices, and present this abstraction as virtual devices to a virtual machine.

Node 120 (e.g., physical machine or virtual machine) may also or alternatively provide operating system level virtualization by running a computer program that provides computing resources to one or more containers 124A-C. Operating system level virtualization may be implemented within the kernel of operating system 122 and may enable the existence of multiple isolated containers. In one example, operating system level virtualization may not require hardware support and may impose little to no overhead because programs within each of the containers may use the system calls of the same underlying operating system 122. This enables node 120 to provide virtualization without the need to provide hardware emulation or be run in an intermediate virtual machine as may occur with hardware level virtualization.

Operating system level virtualization may provide resource management features that isolate or limit the impact of one container (e.g., container 124A) on the resources of another container (e.g., container 124B or 124C). The operating system level virtualization may provide a pool of resources that are accessible by container 124A and are isolated from one or more other containers (e.g., container 124B). The pool of resources may include file system resources (e.g., particular volume), network resources (e.g., particular network address), memory resources (e.g., particular memory portions), other computing resources, or a combination thereof. The operating system level virtualization may also limit a container's access to one or more computing resources by monitoring the container's activity and restricting the activity in view of one or more limits (e.g., quotas). The limits may restrict the rate of the activity, the aggregate amount of the activity, or a combination thereof. The limits may include one or more of disk limits, input/output (I/O) limits, memory limits, CPU limits, network limits, other limits, or a combination thereof. In one example, an operating system virtualizer provides the computing resources to containers 124A-C. The operating system virtualizer may wrap an application in a complete file system that contains the code, runtime, system tools, system libraries and other programs installed on the node that can be used by the application. In one example, the operating system virtualizer may be the same or similar to Docker for Linux®, ThinApp® by VMWare®, Solaris Zones® by Oracle®, or other program that automates the packaging, deployment, and execution of applications inside containers.
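
By way of illustration only, the rate and aggregate limits described above may be sketched as follows. The class name Quota, its fields, and the notion of a numbered accounting interval are hypothetical and are not part of the disclosure; an actual operating system virtualizer would typically enforce such limits in the kernel rather than in user space code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Quota:
    """Hypothetical per-container limit on one kind of activity (e.g., I/O bytes)."""
    max_per_interval: int  # rate limit: units permitted within one interval
    max_total: int         # aggregate limit: units permitted overall
    used_total: int = 0
    used_by_interval: Dict[int, int] = field(default_factory=dict)

    def allow(self, amount: int, interval: int) -> bool:
        """Record the activity and return True only if both limits are respected."""
        used_now = self.used_by_interval.get(interval, 0)
        if used_now + amount > self.max_per_interval:
            return False  # would exceed the rate limit for this interval
        if self.used_total + amount > self.max_total:
            return False  # would exceed the aggregate limit
        self.used_by_interval[interval] = used_now + amount
        self.used_total += amount
        return True
```

In this sketch a request that would exceed either limit is refused and leaves the counters unchanged, which corresponds to restricting the activity in view of the limits.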

Each of the containers 124A-C may refer to a resource-constrained process space of node 120 that can execute functionality of a program. Containers 124A-C may be referred to as user-space instances, virtualization engines (VE), or jails and may appear to a user as a standalone instance of the user space of operating system 122. Each of the containers 124A-C may share the same kernel but may be constrained to only use a defined set of computing resources (e.g., CPU, memory, I/O). Aspects of the disclosure can create one or more containers to host a framework or provide other functionality of an application (e.g., proxy agent functionality, database functionality, web application functionality, etc.) and may therefore be referred to as “application containers.”

Pods 126A and 126B may be data structures that are used to organize one or more containers 124A-C and enhance sharing between containers, which may reduce the level of isolation between containers within the same pod. Each pod may include one or more containers that share computing resources with another container associated with the pod. Each pod may be associated with a unique identifier, which may be a networking address (e.g., IP address), that allows applications to use ports without a risk of conflict. A pod may be associated with a pool of resources and may define a volume, such as a local disk directory or a network disk and may expose the volume to one or more (e.g., all) of the containers within the pod. In one example, all of the containers associated with a particular pod may be co-located on the same node 120. In another example, the containers associated with a particular pod may be located on different nodes that are on the same or different physical machines.

Image repository 130 may be any data store that is capable of storing one or more images 132A-C and being accessed by node 120. Node 120 may remotely access image repository 130 over network 150 or may locally access image repository 130 using a direct connection (e.g., not over a network). Image repository 130 may be stored on a data storage device that includes block-based storage devices, file-based storage devices, or a combination thereof. Block-based storage devices may include one or more data storage devices (e.g., Storage Area Network (SAN) devices) and may provide access to consolidated block-based (e.g., block-level) data storage. Block-based storage devices may be accessible over a network and may appear to an operating system of a computing device as locally attached storage. File-based storage devices may include one or more data storage devices (e.g., Network Attached Storage (NAS) devices) and provide access to consolidated file-based (e.g., file-level) data storage that may be accessible over a network. Image repository 130 may include images 132A-C, storage metadata, and storage lease information. In one example, a secondary storage with image repository 130 may employ block-based storage and images 132A-C, storage metadata, and storage lease may be provided by respective logical volumes. In another example, the secondary storage with image repository 130 may employ file-based storage and images 132A-C, storage metadata, and storage lease may be provided by one or more respective files.

Images 132A-C may be any data structure for storing and organizing information that may be used by node 120 to provide a computing service. The information within images 132A-C may indicate the state of the image and may include executable information (e.g., machine code), configuration information (e.g., settings), or content information (e.g., file data, record data). Each of the images 132A-C may be capable of being loaded onto node 120 and may be executed to perform one or more computing tasks. Images 132A-C may be container images, virtual machine images, disk images, other images, or a combination thereof. A container image may include a user space program (e.g., application) along with a file system that contains the executable code, runtime, system tools, system libraries and other programs to support the execution of the user space program on node 120. The container image may not include an operating system but may be run by an operating system virtualizer that is part of an existing operating system of node 120. A virtual machine image may include both an operating system and one or more user space programs. The virtual machine image may be loaded onto node 120 and may be run by a hypervisor. A disk image may be the same or similar to a virtual machine image (e.g., virtual disk image) but may be loaded onto node 120 and run without using a hypervisor or other form of virtualization technology. In one example, an image may be generated by creating a sector-by-sector copy of a source medium (e.g., hard drive of example machine). In another example, a disk image may be generated based on an existing image and may be manipulated before, during, or after being loaded and executed. The format of images 132A-C may be based on any open standard, such as the ISO image format for optical disc images, or based on a proprietary format.

One or more of the images 132A-C may represent a chain of volumes comprising one or more copy-on-write (COW) volumes (which may also be referred to as “layers”). From the perspective of node 120, the volumes may appear as a single image. Initially, an image may comprise one raw or COW volume, which may be made read-only before the first boot of the virtual machine. An attempt to write to a disk by a virtual machine may modify the image or may trigger adding a new COW volume (“layer”) to the volume chain. The newly created volume may store disk blocks or files that have been modified or newly created by the virtual machine after the previous volume (“layer”) has been made read-only. One or more volumes may be added to the volume chain during the lifetime of the virtual machine. In some implementations, making the previous volume read-only (e.g., responsive to receiving a command via an administrative interface) triggers adding of a new COW volume. The virtual disk device implemented by the hypervisor locates the data by accessing, transparently to the virtual machine, each volume of the chain of volumes, starting from the most recently added volume.
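
As a minimal sketch of the volume chain lookup described above, each volume (“layer”) may be modeled as a mapping from block number to block contents. The function names and the in-memory dictionary model are assumptions made for illustration and do not reflect any particular disk image format.

```python
from typing import Dict, List, Optional

# Hypothetical model: one volume ("layer") maps a block number to its bytes.
Volume = Dict[int, bytes]


def read_block(chain: List[Volume], block: int) -> Optional[bytes]:
    """Search the chain starting from the most recently added volume."""
    for volume in reversed(chain):
        if block in volume:
            return volume[block]
    return None  # block was never written in any volume


def add_cow_volume(chain: List[Volume]) -> None:
    """Treat the current top volume as read-only by stacking an empty volume on it."""
    chain.append({})


def write_block(chain: List[Volume], block: int, data: bytes) -> None:
    """Writes only ever land in the top volume; older volumes are left untouched."""
    chain[-1][block] = data
```

A block modified after a new volume is added is served from the new volume, while unmodified blocks continue to be served from the older, read-only volumes.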

Image repository 130 may include a proxy agent image and a target image. In the example shown in FIG. 1, the proxy agent image may be a container image and may be run within container 124A on node 120. In other examples, the proxy agent image may be a virtual machine image and may be run as a virtual machine on node 120. In either example, the proxy agent image (e.g., image 132) may be retrieved and loaded onto node 120 and provide the functionality of proxy agent 128.

Proxy agent 128 may interact with image analysis component 112 of manager 110 and may access one or more images on node 120, image repository 130, or a combination thereof. Proxy agent 128 may be advantageous because manager 110 may not have access to the images due to existing security measures but may initiate proxy agent 128 on a node with access to the images. In one example, proxy agent 128 may retrieve the image 132B from image repository 130 and may store image 132B locally as target image 132 (e.g., dormant image). In another example, target image 132 may already be executing on node 120 and image analysis component 112 may initiate proxy agent 128 on node 120 so that proxy agent 128 has access to target image 132 while it is running. In either example, proxy agent 128 may receive requests 114 from image analysis component 112 and may access the target image 132 to gather image data 134 from the target image 132 and send it to the image analysis component 112. The features of image analysis component 112, request 114, and image data 134 will be discussed in more detail in regards to FIG. 2.

FIG. 2 is a block diagram illustrating example components and modules of system 200, in accordance with one or more aspects of the present disclosure. System 200 may be the same or similar to manager 110 of FIG. 1 and may include a data store 220 and an image analysis component 112. Image analysis component 112 may include a proxy agent initiation module 212, a request module 214, an image data receiving module 216, and a state determination module 218.

Proxy agent initiation module 212 may communicate with one or more nodes to initiate a proxy agent on one of the nodes. Proxy agent initiation module 212 may identify a node based on whether the node has access to a particular image (e.g., dormant image or running image) that the system has targeted for analysis. In one example, the proxy agent initiation module 212 may know which images are stored on which image repositories but may not be able to access the target image because of enhanced security (e.g., access restrictions). In another example, the proxy agent initiation module 212 may know details about the images but may not know where the image is stored and which nodes have access to the target image. In this latter situation, the proxy agent initiation module 212 may communicate with one or more of the nodes to determine which node has access to the target image.
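
The node identification described above may, purely as a sketch, be expressed as a search over nodes that report which images they can reach. The Node type and its accessible_images field are hypothetical; how a node actually reports its access is not specified here.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Node:
    """Hypothetical view of a node as seen by the proxy agent initiation module."""
    name: str
    accessible_images: Set[str] = field(default_factory=set)


def find_node_with_access(nodes: Iterable[Node], image_id: str) -> Optional[Node]:
    """Return the first node that reports access to the target image, if any."""
    for node in nodes:
        if image_id in node.accessible_images:
            return node
    return None
```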

After identifying a node with access to the image, proxy agent initiation module 212 may send a signal to initiate the proxy agent on the identified node. In one example, the signal may be sent to the node or to a particular container hosted by the node. In another example, the signal may be sent to an intermediate node or other device (e.g., scheduler, manager) which may propagate the signal to the identified node. The signal may identify a particular image (e.g., container image) that includes the proxy agent or application that is functioning as the proxy agent. The particular image may already be on the node or may be stored in an image repository that is the same or different from the image repository that stores the target image being analyzed. The image that includes the proxy agent may be launched or initiated within a container. The container may have existed prior to receiving the signal or may be generated after or in response to the signal. Once the proxy agent is running, it may receive one or more requests from request module 214.

Request module 214 may transmit requests to the proxy agent running on the node and the requests may include data for analyzing a target image accessible to the node. The requests may be the same as request 114 of FIG. 1 and may include data comprising operations, identification information, other data, or a combination thereof. The operations may be any command, procedure, instruction, action, or combination thereof for reading, writing, or executing data accessible by the proxy agent. The operation may be a file system or database operation or may be interpreted or translated to a corresponding file system or database operation. The identification information may be used to determine a particular target image or a particular portion of a target image. The particular target image may originate from a data store (e.g., image repository) that is local to the node or remote from the node and the identification information may include an image repository identifier, an image location identifier (e.g., URL, system path), an image identifier (e.g., file name, UUID), other image identification information, or a combination thereof.

The identification information may also or alternatively be used to determine a particular portion of a target image. The particular portion of the target image may be a data structure (e.g., file, record) within the image or a specific entry (e.g., parameter, setting) within the data structure. In one example, the identification information may be used to identify a particular data structure (e.g., system file) for a web service, application service, database service, other computing service, or a combination thereof. In another example, the identification information may be used to identify a type of data structure such as any configuration file (e.g., filename.conf/.cnf/.cfg), log file (e.g., filename.log), other types of data structures, or a combination thereof. In either example, the proxy agent may access, search, or scan the image to identify the particular portion of the target image in view of the identification information.
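
A minimal sketch of how a proxy agent might select data structures by type is given below, assuming that the identification information carries an image identifier and a set of file name suffixes. The Identification type and the select_paths function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple

# Suffixes taken from the examples in the text above.
CONFIG_SUFFIXES: Tuple[str, ...] = (".conf", ".cnf", ".cfg")
LOG_SUFFIXES: Tuple[str, ...] = (".log",)


@dataclass(frozen=True)
class Identification:
    """Hypothetical identification information carried by a request."""
    repository_id: str
    image_id: str
    suffixes: Tuple[str, ...]


def select_paths(paths: Iterable[str], ident: Identification) -> List[str]:
    """Keep only the paths within the image whose suffix matches the requested type."""
    return [p for p in paths if PurePosixPath(p).suffix in ident.suffixes]
```

For instance, a request built with CONFIG_SUFFIXES would select /etc/my.cnf and skip /var/log/app.log when both are present in the image.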

In one example, request module 214 may transmit the one or more requests using a Web Distributed Authoring and Versioning (WebDAV) protocol. WebDAV may be an extension to the hypertext transfer protocol (HTTP) that allows clients to perform remote web content authoring operations. The WebDAV protocol may enable image analysis component 112 to access (e.g., read), and/or modify (e.g., change, create) data structures of the target image using the proxy agent running on the node. In one example, the request module may transmit one or more requests with information for identifying a particular target image and a particular portion of the target image using the WebDAV protocol. The proxy agent may retrieve (e.g., copy, download, stream) the target image from an image repository in response to the one or more requests. The proxy agent may access and scan the target image in response to the request to identify data and may transmit the data to image data receiving module 216.
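
As a sketch only, a WebDAV listing request of the kind described above may be constructed with the Python standard library as shown below. PROPFIND and the Depth header are defined by the WebDAV specification, whereas the proxy agent address, the URL layout under which the proxy agent exposes the target image, and the function name are assumptions. The example builds the request without sending it.

```python
from urllib.parse import quote
from urllib.request import Request

# Request body asking for all properties of the addressed resources.
PROPFIND_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
)


def build_propfind_request(agent_url: str, image_path: str) -> Request:
    """Build (but do not send) a PROPFIND request for a directory of the target image."""
    url = agent_url.rstrip("/") + "/" + quote(image_path.lstrip("/"))
    return Request(
        url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        headers={
            "Depth": "1",  # the directory itself and its immediate children
            "Content-Type": "application/xml",
        },
    )
```

Under these assumptions, a request module could pass the returned object to urllib.request.urlopen to transmit it and then parse the multistatus response returned by the proxy agent.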

Image data receiving module 216 may receive the data from the target image via the proxy agent and may store the data within a data store (e.g., memory, hard disk) as image data 222. Image data 222 may include textual data, binary data, or other data from an image or about an image. The textual data may be numeric, alphanumeric, other character data, or a combination thereof. The binary data may be executable code, non-executable code, or a portion thereof. The textual or binary data may indicate a state of an image and may be accessed by state determination module 218.

State determination module 218 may analyze image data 222 to determine the state of the target image. The state may be stored as state data 224 within data store 220 and may be associated with the particular target image. State data 224 may indicate information about one or more computer programs, hardware components, configurations, versions, credentials, or other computing features associated with the target image. The computer programs may relate to operating systems (OS), kernels, drivers, middleware, applications, or other programs installed on the image. The hardware components may include physical hardware or virtual hardware (e.g., emulated hardware) such as processors (e.g., CPU, GPU), memory (e.g., main memory), persistent storage (e.g., hard drive), interface adapters (e.g., network interface card, graphics interface card), other hardware, or a combination thereof. The configurations and versions may indicate the functions, features, or settings associated with the computer programs or hardware. For example, the configurations may indicate the network information associated with the image, such as the network identifiers (e.g., IP address, MAC address, ports) associated with a hardware adapter or computer program of the image. The version information may include version names, numbers, or other information such as security patches, hotfixes, or updates associated with the image. The credentials may include accounts (e.g., user accounts) that are associated with the image. The account may be related to the creation of the image, instantiation of the image, logging into the image, other association, or a combination thereof. State determination module 218 may derive state data 224 from image data 222 using image analysis logic 226.

Image analysis logic 226 may include one or more rules that analyze, filter, or compare the image data 222 to other data. In one example, image data 222 may include a system configuration file and the image analysis logic 226 may include information for analyzing the structure of the configuration file and for identifying a specific parameter to determine a state of the image (e.g., kernel feature). In another example, image data 222 may include a portion of a binary file (e.g., hash) and the image analysis logic 226 may analyze the portion of the binary file to identify the type or version of the binary file. For example, the image analysis logic 226 may compare the portion of the binary file to fingerprints of known binary files. Image analysis logic 226 may also include logic for detecting flaws in the image data 222 or state data 224. The image flaw may be associated with a misconfiguration, a vulnerability, or a performance degradation. The image flaw may be stored as image flaw data 228 and associated with the target image.
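
The two kinds of rules described above may be sketched as follows. The key=value configuration syntax, the use of SHA-256 digests as fingerprints, and both function names are assumptions chosen for illustration; other file structures and fingerprinting schemes may be used.

```python
import hashlib
from typing import Dict, Optional


def find_parameter(config_text: str, name: str) -> Optional[str]:
    """Return the value of one parameter in a simple key=value configuration file."""
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == name:
            return value.strip()
    return None


def identify_binary(data: bytes, fingerprints: Dict[str, str]) -> Optional[str]:
    """Look up binary data in a table mapping SHA-256 hex digests to known labels."""
    return fingerprints.get(hashlib.sha256(data).hexdigest())
```

Here the fingerprint table plays the role of the fingerprints of known binary files, and a missing entry simply means that the type or version could not be identified.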

Image analysis logic 226 may be updated before, during, or after it is applied to image data 222 and may subsequently be re-applied to image data 222 to update state data 224, image flaw data 228, or a combination thereof. This may enable image analysis component 112 to request image data 222 from the proxy agent a single time and evaluate the image data 222 multiple times using the image analysis logic 226 (e.g., periodic evaluations). This may be advantageous because image analysis logic 226 may be updated to identify additional states (e.g., new features) or flaws (e.g., security vulnerabilities) and image analysis logic 226 may be applied to the image data 222 without re-launching the proxy agent or re-requesting the data from a remote image. In one example, the image analysis component 112 and image analysis logic 226 may be consolidated to a centralized device (e.g., virtualization server or orchestration server) so that the logic can be updated at a single location as opposed to updating each of the proxy agents.
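
A minimal sketch of re-applying updated analysis logic to stored image data is given below. The representation of stored image data as a mapping from path to file contents, the Rule type, and the two example rules are hypothetical and serve only to illustrate evaluating the same stored data more than once.

```python
from typing import Callable, Dict, List

# Hypothetical representation: stored image data maps a path to its text contents.
ImageData = Dict[str, str]
Rule = Callable[[ImageData], List[str]]


def evaluate(image_data: ImageData, rules: List[Rule]) -> List[str]:
    """Apply every rule to the stored image data and collect the reported flaws."""
    flaws: List[str] = []
    for rule in rules:
        flaws.extend(rule(image_data))
    return flaws


def root_login_rule(image_data: ImageData) -> List[str]:
    text = image_data.get("/etc/ssh/sshd_config", "")
    return ["root login permitted"] if "PermitRootLogin yes" in text else []


def empty_password_rule(image_data: ImageData) -> List[str]:
    # Example of a rule added later, after the image data was already stored.
    text = image_data.get("/etc/ssh/sshd_config", "")
    return ["empty passwords permitted"] if "PermitEmptyPasswords yes" in text else []
```

In this sketch, adding empty_password_rule to the rule list and calling evaluate again on the same stored data yields the additional flaw without any further request to the proxy agent.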

FIGS. 3 and 4 depict flow diagrams for illustrative examples of methods 300 and 400 for analyzing the state of computing images. Method 300 may be executed by a server device and method 400 may be executed by a client device. Methods 300 and 400 may be performed by processing devices that comprise hardware (e.g., circuitry, dedicated logic), computer readable instructions (e.g., run on a general purpose computer system or a dedicated machine), or a combination of both. Methods 300 and 400 and each of their individual functions, routines, subroutines, or operations may be performed by one or more processors of the computer device executing the method. In certain implementations, methods 300 and 400 may each be performed by a single processing thread. Alternatively, methods 300 and 400 may be performed by two or more processing threads, each thread executing one or more individual functions, routines, subroutines, or operations of the method.

For simplicity of explanation, the methods of this disclosure are depicted and described as a series of acts. However, acts in accordance with this disclosure can occur in various orders and/or concurrently, and with other acts not presented and described herein. Furthermore, not all illustrated acts may be needed to implement the methods in accordance with the disclosed subject matter. In addition, those skilled in the art will understand and appreciate that the methods could alternatively be represented as a series of interrelated states via a state diagram or events. Additionally, it should be appreciated that the methods disclosed in this specification are capable of being stored on an article of manufacture to facilitate transporting and transferring such methods to computing devices. The term “article of manufacture,” as used herein, is intended to encompass a computer program accessible from any computer-readable device or storage media. In one implementation, methods 300 and 400 may be performed by computing device 120 or system 500 as shown in FIGS. 1 and 5 respectively.

Referring to FIG. 3, method 300 may be performed by processing devices of a computing device and may begin at block 302. At block 302, a processing device may initiate a proxy agent on a node, the proxy agent having access to an image repository comprising an image. In one example, the method may be performed by a management device that is unable to access the image in the image repository. The node may provide operating system level virtualization for a container and the proxy agent may execute within the container. The container may comprise multiple containers associated with a pod that provides an image scanning service on behalf of the management device. Each of the multiple containers may be a user space process executing on a kernel of the node. The image may comprise at least one of a container image, a virtual machine image, or a disk image. In one example, the image may comprise a file system data structure comprising a directory object and a file object.

At block 304, the processing device may transmit to the proxy agent a request for image data of the image. The request may comprise a file system operation to access a portion of the image. In one example, the request may comprise a plurality of network requests, a first network request causing the proxy agent to retrieve (e.g., access, download, stream) the image from the image repository and a second network request causing the proxy agent to search for and identify the image data. In another example, the request may comprise an HTTP request comprising a Web Distributed Authoring and Versioning (WebDAV) operation.
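As a purely illustrative sketch of the WebDAV example above, the following Python code builds (but does not send) an HTTP request carrying a WebDAV PROPFIND operation for a path inside an image. The function name `build_propfind_request`, the proxy host, and the `/images/<image id>/<path>` URL layout are assumptions made for illustration and are not taken from this disclosure; only the PROPFIND method, the `Depth` header, and the `DAV:` XML namespace come from the WebDAV standard.

```python
import urllib.request
from urllib.parse import quote


def build_propfind_request(proxy_url: str, image_id: str, path: str,
                           depth: str = "1") -> urllib.request.Request:
    """Build, without sending, a WebDAV PROPFIND request for a path in an image."""
    # Hypothetical URL layout: <proxy>/images/<image id>/<path inside the image>.
    url = "{}/images/{}/{}".format(
        proxy_url.rstrip("/"),
        quote(image_id, safe=""),   # percent-encode ':' and '/' in the image id
        quote(path.lstrip("/")),    # keep '/' separators inside the path
    )
    # Standard PROPFIND body asking for all properties of the target resource.
    body = (
        b'<?xml version="1.0" encoding="utf-8"?>'
        b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
    )
    return urllib.request.Request(
        url,
        data=body,
        method="PROPFIND",
        headers={
            "Depth": depth,  # "1" lists the resource and its direct children
            "Content-Type": 'application/xml; charset="utf-8"',
        },
    )


if __name__ == "__main__":
    request = build_propfind_request("http://proxy.example:8080", "app:1.0", "/etc")
    print(request.get_method(), request.full_url)
```

Under these assumptions, a management device could pass the returned object to `urllib.request.urlopen` to list a directory of the image through the proxy agent, which is one way a file system operation might be carried over HTTP.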

At block 306, the processing device may receive the image data from the proxy agent. In one example, the image may be a dormant image and the image data may be retrieved by the proxy agent without executing the image. In another example, the image data may be retrieved by the proxy agent from a dormant version of the image while the image is being executed, and the dormant version of the image may comprise a snapshot of the image. In yet another example, the image data may be retrieved by the proxy agent from the image while the image is being executed.

At block 308, the processing device may analyze the image data to determine a state of the image. The state of the image comprises at least one of a program feature, an operating system feature, or a hardware architecture feature. Responsive to completing the operations described herein above with reference to block 308, the method may terminate.
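The analysis at block 308 is not tied to any particular algorithm in this description. A minimal sketch, assuming the received image data is a mapping from file paths to text content, might derive an operating system feature from an `etc/os-release` file and a program feature from the file names under `usr/bin/`. The function names, the mapping layout, and the choice of files are hypothetical.

```python
def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in the style of an os-release file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip("\"'")
    return fields


def determine_state(image_data: dict) -> dict:
    """Derive a simple state from image data (assumed: file path -> text)."""
    os_info = parse_os_release(image_data.get("etc/os-release", ""))
    return {
        # Operating system feature: distribution identifier and version.
        "operating_system": {
            "id": os_info.get("ID"),
            "version": os_info.get("VERSION_ID"),
        },
        # Program feature: executables found under usr/bin/.
        "programs": sorted(p for p in image_data if p.startswith("usr/bin/")),
    }


if __name__ == "__main__":
    sample = {
        "etc/os-release": 'ID=fedora\nVERSION_ID="39"\n',
        "usr/bin/bash": "",
    }
    print(determine_state(sample))
```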

In other examples of method 300, the processing device may include one or more additional blocks. One additional block may be associated with detecting whether the state of the image is associated with a flaw. The flaw may comprise a misconfiguration, a vulnerability, or a performance degradation. In response to detecting a flaw, the processing device may perform an action, such as an action that initiates a modification of the image. Another additional block may be associated with updating a rule for detecting that the state is associated with a flaw and re-analyzing the stored image data to determine the state of the image.
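The rule update and re-analysis described above can be illustrated with a short sketch. It assumes that the image data received from the proxy agent has been stored as a mapping from file paths to text, and that a rule is a named predicate over that mapping; the `Rule` class, the `detect_flaws` function, and the sample rules are hypothetical. The point of the sketch is that an updated rule can be applied to the stored image data without contacting the proxy agent again.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    kind: str  # "misconfiguration", "vulnerability", or "performance degradation"
    matches: Callable[[Dict[str, str]], bool]


def detect_flaws(stored_image_data: Dict[str, str], rules: List[Rule]) -> List[str]:
    """Return the names of the rules that flag the stored image data."""
    return [rule.name for rule in rules if rule.matches(stored_image_data)]


def sshd_config(data: Dict[str, str]) -> str:
    return data.get("etc/ssh/sshd_config", "")


if __name__ == "__main__":
    stored = {"etc/ssh/sshd_config": "PermitRootLogin yes\n"}

    # Initial rule: only empty passwords are treated as a flaw.
    rules = [Rule("sshd-weak-config", "misconfiguration",
                  lambda d: "PermitEmptyPasswords yes" in sshd_config(d))]
    print(detect_flaws(stored, rules))  # no flaw reported yet

    # Updated rule: root login is now also treated as a flaw.
    rules = [Rule("sshd-weak-config", "misconfiguration",
                  lambda d: "PermitEmptyPasswords yes" in sshd_config(d)
                  or "PermitRootLogin yes" in sshd_config(d))]
    print(detect_flaws(stored, rules))  # re-analysis of the same stored data
```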

Referring to FIG. 4, method 400 may be performed by processing devices of a computing device and may begin at block 402. Method 400 may include operations executed by a node device (e.g., node 120), as opposed to a server device (e.g., manager 110). At block 402, a processing device may launch a proxy agent on the node and the proxy agent may have access to an image repository comprising an image. The processing device may launch the proxy agent in response to a signal from a management device that is unable to directly access the image in the image repository. In one example, the node may provide operating system level virtualization for a container and the proxy agent may execute within the container. The container may comprise multiple containers associated with a pod that provides an image scanning service on behalf of the management device. Each of the multiple containers may be a user space process executing on a kernel of the node. The image may comprise at least one of a container image, a virtual machine image, or a disk image. In one example, the image may comprise a file system data structure comprising a directory object and a file object.

At block 404, the processing device may receive, from a remote device, a request for image data of the image. The request may comprise a file system operation to access a portion of the image. In one example, the request may comprise a plurality of network requests, a first network request causing the proxy agent to retrieve (e.g., access, download, stream) the image from the image repository and a second network request causing the proxy agent to search for and identify the image data. In another example, the request may comprise an HTTP request comprising a Web Distributed Authoring and Versioning (WebDAV) operation.

At block 406, the processing device may retrieve the image from the image repository. In one example, the image may be a dormant image and the image data may be retrieved by the proxy agent without executing the image. In another example, the image data may be retrieved by the proxy agent from a dormant version of the image while the image is being executed and the dormant version of the image may comprise a snapshot of the image. In yet another example, the image data may be retrieved by the proxy agent from the image while the image is being executed.
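One way to picture retrieval from a dormant image is reading a single file out of an archived file system without starting anything from the image. The sketch below assumes, for illustration only, that the image (or one layer of it) is available to the proxy agent as a tar archive held in memory; the function name `read_image_file` and the archive layout are hypothetical and are not prescribed by this description.

```python
import io
import tarfile


def read_image_file(archive_bytes: bytes, member_path: str) -> bytes:
    """Read one file from a tar-archived image without executing the image."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as archive:
        member = archive.getmember(member_path)  # raises KeyError if absent
        extracted = archive.extractfile(member)
        if extracted is None:
            # Directories and special files carry no readable data.
            raise ValueError(f"{member_path} is not a regular file")
        return extracted.read()


if __name__ == "__main__":
    # Build a tiny in-memory archive that stands in for a dormant image.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        data = b"demo-node\n"
        info = tarfile.TarInfo("etc/hostname")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

    print(read_image_file(buffer.getvalue(), "etc/hostname"))
```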

At block 408, the processing device may transmit, by the proxy agent, the image data to the remote device to determine a state of the image. The state of the image comprises at least one of a program feature, a package feature, an operating system feature, or a hardware architecture feature. Responsive to completing the operations described herein above with reference to block 408, the method may terminate.

FIG. 5 depicts a block diagram of a computer system operating in accordance with one or more aspects of the present disclosure. In various illustrative examples, computer system 500 may correspond to manager 110 or node 120 of FIG. 1. The computer system may be included within a data center that supports virtualization. Virtualization within a data center results in a physical system being virtualized using virtual machines to consolidate the data center infrastructure and increase operational efficiencies. A virtual machine (VM) may be a program-based emulation of computer hardware. For example, the VM may operate based on computer architecture and functions of computer hardware resources associated with hard disks or other such memory. The VM may emulate a physical computing environment, but requests for a hard disk or memory may be managed by a virtualization layer of a computing device to translate these requests to the underlying physical computing hardware resources. This type of virtualization results in multiple VMs sharing physical resources.

In certain implementations, computer system 500 may be connected (e.g., via a network, such as a Local Area Network (LAN), an intranet, an extranet, or the Internet) to other computer systems. Computer system 500 may operate in the capacity of a server or a client computer in a client-server environment, or as a peer computer in a peer-to-peer or distributed network environment. Computer system 500 may be provided by a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, switch or bridge, or any device capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that device. Further, the term “computer” shall include any collection of computers that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methods described herein.

In a further aspect, the computer system 500 may include a processing device 502, a volatile memory 504 (e.g., random access memory (RAM)), a non-volatile memory 506 (e.g., read-only memory (ROM) or electrically-erasable programmable ROM (EEPROM)), and a data storage device 516, which may communicate with each other via a bus 508.

Processing device 502 may be provided by one or more processors such as a general purpose processor (such as, for example, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a microprocessor implementing other types of instruction sets, or a microprocessor implementing a combination of types of instruction sets) or a specialized processor (such as, for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), or a network processor).

Computer system 500 may further include a network interface device 522. Computer system 500 also may include a video display unit 510 (e.g., an LCD), an alphanumeric input device 512 (e.g., a keyboard), a cursor control device 514 (e.g., a mouse), and a signal generation device 520.

Data storage device 516 may include a non-transitory computer-readable storage medium 524 on which may be stored instructions 526 encoding any one or more of the methods or functions described herein, including instructions for implementing methods 300 or 400 and for encoding image analysis component 112 and modules illustrated in FIG. 2.

Instructions 526 may also reside, completely or partially, within volatile memory 504 and/or within processing device 502 during execution thereof by computer system 500; hence, volatile memory 504 and processing device 502 may also constitute machine-readable storage media.

While computer-readable storage medium 524 is shown in the illustrative examples as a single medium, the term “computer-readable storage medium” shall include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of executable instructions. The term “computer-readable storage medium” shall also include any tangible medium that is capable of storing or encoding a set of instructions for execution by a computer that cause the computer to perform any one or more of the methods described herein. The term “computer-readable storage medium” shall include, but not be limited to, solid-state memories, optical media, and magnetic media.

The methods, components, and features described herein may be implemented by discrete hardware components or may be integrated in the functionality of other hardware components such as ASICs, FPGAs, DSPs or similar devices. In addition, the methods, components, and features may be implemented by firmware modules or functional circuitry within hardware devices. Further, the methods, components, and features may be implemented in any combination of hardware devices and computer program components, or in computer programs.

Unless specifically stated otherwise, terms such as “initiating,” “transmitting,” “receiving,” “analyzing,” or the like, refer to actions and processes performed or implemented by computer systems that manipulate and transform data represented as physical (electronic) quantities within the computer system registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices. Also, the terms “first,” “second,” “third,” “fourth,” etc. as used herein are meant as labels to distinguish among different elements and may not have an ordinal meaning according to their numerical designation.

Examples described herein also relate to an apparatus for performing the methods described herein. This apparatus may be specially constructed for performing the methods described herein, or it may comprise a general purpose computer system selectively programmed by a computer program stored in the computer system. Such a computer program may be stored in a computer-readable tangible storage medium.

The methods and illustrative examples described herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used in accordance with the teachings described herein, or it may prove convenient to construct more specialized apparatus to perform methods 300, 400 and/or each of their individual functions, routines, subroutines, or operations. Examples of the structure for a variety of these systems are set forth in the description above.

The above description is intended to be illustrative, and not restrictive. Although the present disclosure has been described with references to specific illustrative examples and implementations, it will be recognized that the present disclosure is not limited to the examples and implementations described. The scope of the disclosure should be determined with reference to the following claims, along with the full scope of equivalents to which the claims are entitled.

What is claimed is:
 1. A method comprising: initiating, by a processing device, a proxy agent on a node, the proxy agent having access to an image repository comprising an image; transmitting to the proxy agent a request for image data of the image; receiving the image data from the proxy agent; and analyzing the image data to determine a state of the image.
 2. The method of claim 1, wherein the node provides operating system level virtualization for a container and the proxy agent executes within the container.
 3. The method of claim 1, wherein the image is a dormant image and the image data is retrieved without executing the image.
 4. The method of claim 1, wherein the image data is retrieved from a dormant version of the image while the image is being executed, the dormant version of the image comprising a snapshot of the image.
 5. The method of claim 1, wherein the image data is retrieved from the image while the image is being executed.
 6. The method of claim 1, wherein the image comprises at least one of a container image, a virtual machine image, or a disk image.
 7. The method of claim 1, wherein the request comprises a plurality of network requests, wherein the plurality of network requests comprises a first network request to retrieve the image from the image repository and a second network request to identify the image data.
 8. The method of claim 1, wherein the request comprises a file system operation to access a portion of the image.
 9. The method of claim 1, wherein the request comprises an HTTP request comprising a Web Distributed Authoring and Versioning (WebDAV) operation.
 10. The method of claim 1, wherein the state of the image comprises at least one of a computer program feature, an operating system feature, or a hardware feature.
 11. The method of claim 1, further comprising: detecting whether the state of the image is associated with a flaw, the flaw comprising a misconfiguration, a vulnerability, or a performance degradation; and performing an action in view of the state, wherein the action initiates a modification of the image.
 12. The method of claim 1, further comprising: storing the image data received from the proxy agent; updating a rule for detecting that the state is associated with a flaw; and re-analyzing the stored image data to determine the state of the image.
 13. The method of claim 1, wherein the container comprises multiple containers associated with a pod providing an image scanning service, wherein each of the multiple containers comprises a user space process executing on a kernel of the node.
 14. The method of claim 1, wherein the processing device is part of a management device that is unable to access the image in the image repository.
 15. The method of claim 1, wherein the image comprises a file system data structure comprising a directory object and a file object.
 16. A system comprising: a memory; a processing device operatively coupled to the memory, the processing device to: initiate a proxy agent on a node, the proxy agent having access to an image repository comprising an image; transmit to the proxy agent a request for image data of the image; receive the image data from the proxy agent; and analyze the image data to determine a state of the image.
 17. The system of claim 16, wherein the node provides operating system level virtualization for a container and the proxy agent executes within the container.
 18. The system of claim 16, wherein the image is a dormant image and the image data is retrieved without executing the image.
 19. A non-transitory machine-readable storage medium storing instructions that cause a processing device to: launch a proxy agent on a node, the proxy agent comprising access to an image repository comprising an image; receive, from a remote device, a request for image data of the image; retrieve the image from the image repository; and transmit, by the proxy agent, the image data to the remote device to determine a state of the image.
 20. The non-transitory machine-readable storage medium of claim 19, wherein the image is a dormant image and the image data is retrieved without executing the image.